The double-edged sword of big data in healthcare

by Haolun Xu

Words: 1,025 | Reading time: 10 minutes

“The value of big data isn’t the data, it’s the narrative.”

-Dr Kristian J Hammond, Professor of computer science at Northwestern University, USA

We live in a world interwoven by a growing amount of processed data. From the global average fertility rate of 2.3 children per woman in 2020, to 676 million of the world’s population having been infected with Covid-19, insight from data helps us to make sense of the world around us. However, have you ever wondered how these numbers and data are generated and what happens when they are grouped into different categories? The answer is big data.

What is big data?

“Generally speaking, big data is a type of basic, unfiltered, and unprocessed information in very large quantities, which undergoes a certain treatment using computer algorithms. Once the data is processed in this way, it can be used to give us answers to specific and complex questions,” explains Professor Angelos Stefanidis, Dean of the School of AI and Advanced Computing at XJTLU Entrepreneur College (Taicang).

In more technical terms, big data is a vast collection (volume) of valuable data combined in various forms, characterised by features such as velocity and veracity. These are also known as the five Vs of big data – volume, variety, velocity, value, and veracity.

The five Vs

The Five Vs are the features that describe big data as we begin generating quality information from raw data for an array of applications.

Volume

Big data refers to a vast volume of data generated by the interaction between users and electronic devices, such as data feeds from mobile apps and page views on a website.

Velocity

Velocity refers to the speed with which data is generated and how quickly it can be analysed.

Variety

Data comes in many forms, such as dates, names, numbers, audio, and videos.

Veracity

Veracity describes the quality, accuracy or truthfulness of the data.

Value

The value of data lies in whether a data set can be interpreted and turned into a tool for evidence-based decision-making.

Healthcare is an area where big data plays a significant role in improving efficiency.

Big data in healthcare

Big data in healthcare has the potential to revolutionise patient care, research, and operational efficiency. It has become a valuable tool for making more efficient medical diagnoses, among other things. With the help of artificial intelligence (AI) and machine learning technology, computers collect patient data, such as age, gender, medical history, recent symptoms, and other relevant information. Then, algorithms are applied to identify hidden or common patterns in the data that medical professionals can further analyse to provide tailored therapeutic solutions.

“With an abundance of high-quality medical data and sophisticated AI algorithms at our disposal, medical doctors now have a lot more confidence in making medical diagnoses aided by AI.”

-Professor Angelos Stefanidis, Dean of the School of AI and Advanced Computing at XJTLU Entrepreneur College (Taicang)

One example of using AI on big data to improve medical diagnoses comes from Sifan Song, a PhD student at XJTLU, and his co-researchers. They have developed a deep learning model, “ViT-Patch GAN”, to improve the quality of chromosome straightening by processing data. Chromosomes have varying degrees of curvature, so straightening them makes the banding information on the chromosome easier to read. This is an essential step for diagnosing various genetic disorders, such as Klinefelter syndrome, and certain cancers.

The advancement of AI-aided medical diagnosis relies on the amount of healthcare data that is generated and can then be processed. A 2018 Statista report estimated that 2,314 exabytes of new data would be generated in 2020, approximately 15 times the volume generated in 2013.

To give an idea of just how large 2,314 exabytes is, an iPhone 14 Pro Max can hold roughly one terabyte of data. One exabyte equals one million terabytes, so 2,314 exabytes are equivalent to a pile of about 2.3 billion iPhone 14 Pro Maxes.
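For readers who want to check the conversion, here is a minimal sketch in Python. It assumes decimal units (one exabyte = one million terabytes) and treats each phone as holding exactly one terabyte. The constant and function names are illustrative and do not come from the Statista report.

```python
# Illustrative check of the exabyte-to-phone comparison.
# Assumptions: decimal (SI) units, and one phone stores exactly 1 TB.
TERABYTES_PER_EXABYTE = 1_000_000
PHONE_CAPACITY_TB = 1


def phones_needed(exabytes: int, phone_capacity_tb: int = PHONE_CAPACITY_TB) -> int:
    """Return how many phones of the given capacity would hold `exabytes` of data."""
    total_terabytes = exabytes * TERABYTES_PER_EXABYTE
    # Ceiling division, so a partly filled phone still counts as one phone.
    return -(-total_terabytes // phone_capacity_tb)


if __name__ == "__main__":
    print(f"{phones_needed(2_314):,} phones")  # prints "2,314,000,000 phones"
```

Under these assumptions the total comes to 2,314,000,000 devices, or roughly 2.3 billion.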

With vast amounts of data being generated so quickly, often containing sensitive information about patients and their families, should we be worried about our personal data being utilised without our consent or in an inappropriate way?

Data vs privacy

“The practice of having patients’ consent for personal data usage exists in most countries around the world,” Professor Stefanidis says. “Medical professionals are ethically and legally bound to only obtain consent from those patients whose private health data would be used as part of a particular medical research project or to devise a medical treatment. China is a good example of a country with very strong personal data protection regulations.”

Nevertheless, a conflict of interest can arise when we try to balance maintaining data privacy by imposing ethical barriers or legal restrictions with an organisation’s need to process private patient data.

“We have to find the right balance and make sure that the AI solutions we develop respect the individual and their privacy and give people the right to determine how their data is used.”

-Professor Angelos Stefanidis, Dean of the School of AI and Advanced Computing at XJTLU Entrepreneur College (Taicang)

In a Forbes article, Nick Culbertson highlights two cases that illustrate how vulnerable data privacy is in healthcare. In one, hackers stole patient data to sell for profit on the black market or to force large organisations to make ransom payments. In the other, a woman who was then working as a patient care technician at a hospital in Iowa, USA, illegally accessed her ex-boyfriend’s medical records and used them to blackmail him.

Despite the privacy issues that need careful consideration, big data fosters connectivity between healthcare organisations, which enables the advancement of medical science. During the Covid-19 pandemic, organisations such as the World Health Organization (WHO) and the Center for Systems Science and Engineering at Johns Hopkins University used data-sharing platforms to speed up the exchange of data and research into the development of vaccines. The platforms improved understanding of Covid-19 and helped to find therapeutic strategies. Yet it is clear that data privacy laws must be constantly reviewed to ensure that advances in technology are matched with safeguards for everyone’s protection.

Join the Conversation

As time goes by, big data and healthcare become more interdependent. Are you worried about your healthcare data being misused, and do you have a solution to guard your data privacy? Have your say using #LightandWingsMag on social media platforms.